Active learning for constrained regression using kernel beta regression models

Authors

  • Luis Montesano
  • Manuel Lopes
Abstract

In this poster we study active learning for supervised regression algorithms. We focus on a particular problem where the regression is constrained to an interval. This is the case, for instance, when the objective is to estimate the probability of success of a certain event given a set of input points. The regression problem is thus constrained to the zero-one interval, and the training examples are also discrete, e.g. success or failure. The constrained regression problem is usually solved using logistic regression [1], which is in essence a generalized linear model for discrete outputs (e.g. classes) based on the logits of the probabilities. Active strategies for this type of technique have been studied in [2], which pointed out a trade-off between the computational cost of experimental-design-based techniques and robustness issues that appear in more heuristic criteria. A support vector machine version of logistic regression, kernel logistic regression [3], uses the log-likelihood of the binomial distribution as the loss function and directly provides estimates of the probability. Constrained regression has also been studied using Beta models in economics [4], to model rates and proportions [5], and in psychological studies to account for skew and heteroscedasticity [6]. In this case, a parametric model is fitted by maximizing the likelihood function using, for instance, Newton-Raphson or Fisher scoring. Better experimental results have been reported using alternative residuals in [7]. Bayesian versions of these regressors have recently been proposed based on a hierarchical model and priors on the parameters [8]. This poster proposes a new algorithm for such a constrained regression problem, especially suited to the active learning framework. Like logistic regression, the algorithm uses a Binomial likelihood model. At each point of the input space, a conjugate Binomial-Beta model provides the distribution of the probability of success.
The two parameters of the Beta distribution at a particular point $x_*$ are computed by accumulating evidence of successful and failed events at training points $x_i$ using a kernel $K(x_*, x_i)$. The expression for the Beta distribution is

$$p(p_* \mid x_*, X_n, Y_n) \propto \prod_{i=0}^{n} \mathrm{Bin}(S_{*i};\, p_*,\, S_{*i}+U_{*i})\, \mathrm{Be}(p_*;\, \alpha_0, \beta_0) = \mathrm{Be}\!\left(p_*;\; \alpha_0 + \sum_{i=0}^{n} S_{*i},\; \beta_0 + \sum_{i=0}^{n} U_{*i}\right)$$
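The accumulation step above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: it assumes kernel-weighted counts $S_{*i} = K(x_*, x_i)\,y_i$ and $U_{*i} = K(x_*, x_i)\,(1-y_i)$, and uses a squared-exponential kernel purely as an example (the poster only assumes some kernel $K$):

```python
import numpy as np

def rbf_kernel(x_star, x_i, length_scale=0.5):
    # Example kernel choice; the model only requires some K(x*, xi).
    return np.exp(-np.sum((x_star - x_i) ** 2) / (2.0 * length_scale ** 2))

def beta_posterior(x_star, X, y, alpha0=1.0, beta0=1.0, length_scale=0.5):
    """Kernel-weighted Binomial-Beta posterior at a query point x_star.

    X : (n, d) array of training inputs x_i
    y : (n,) binary outcomes (1 = success, 0 = failure)
    Returns (alpha, beta), the parameters of Be(p*; alpha, beta).
    """
    w = np.array([rbf_kernel(x_star, xi, length_scale) for xi in X])
    S = np.sum(w * y)         # accumulated evidence of successes, sum of S*i
    U = np.sum(w * (1 - y))   # accumulated evidence of failures,  sum of U*i
    return alpha0 + S, beta0 + U

# Usage: the posterior mean estimates the success probability, and the
# posterior variance gives an uncertainty measure that an active-learning
# criterion can exploit when choosing the next query point.
X = np.array([[0.0], [0.2], [0.9], [1.0]])
y = np.array([1, 1, 0, 0])
a, b = beta_posterior(np.array([0.1]), X, y)
p_mean = a / (a + b)
p_var = a * b / ((a + b) ** 2 * (a + b + 1))
```

Because the model is fully conjugate, no iterative fitting (e.g. Newton-Raphson) is needed at prediction time; each query reduces to weighted counting.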




Journal:

Volume   Issue

Pages  -

Publication date: 2009